Discrete-to-deep reinforcement learning methods
Authors
Abstract
Neural networks are effective function approximators, but they are hard to train in the reinforcement learning (RL) context, mainly because samples are correlated. In complex problems, a neural RL approach is often able to learn a better solution than tabular RL, but it generally takes longer. This paper proposes two methods, Discrete-to-Deep Supervised Policy Learning (D2D-SPL) and Discrete-to-Deep Supervised Q-value Learning (D2D-SQL), whose objective is to acquire the generalisability of a neural network at a cost nearer to that of a tabular method. Both methods combine RL and supervised learning (SL) and are based on the idea that a fast-learning tabular method can generate off-policy data to accelerate learning in neural RL. D2D-SPL uses the data to train a classifier, which is then used as a controller for the RL problem. D2D-SQL uses the data to initialise a neural network, which is then allowed to continue learning using another RL method. We demonstrate the viability of our algorithms with Cartpole, Lunar Lander and an aircraft manoeuvring problem, three continuous-space environments with low-dimensional state variables. Both methods learn at least 38% faster than baseline methods and yield policies that outperform them.
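The D2D-SPL idea in the abstract (a tabular learner generates data, and a supervised classifier trained on that data becomes the controller) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-dimensional toy environment, the bin count, the hyperparameters, and the use of plain logistic regression in place of a neural network classifier. All function names are hypothetical.

```python
import math
import random

# Hypothetical toy problem: a point on the line [-1, 1] should move towards 0.
N_BINS, ACTIONS, STEP = 10, (0, 1), 0.1   # action 0 = move left, 1 = move right


def step(x, action):
    """Toy dynamics: shift x by one step; reward is higher closer to x = 0."""
    x = max(-1.0, min(1.0, x + (STEP if action == 1 else -STEP)))
    return x, -abs(x)


def to_bin(x):
    """Discretise the continuous state for the tabular learner."""
    return min(int((x + 1.0) / 2.0 * N_BINS), N_BINS - 1)


def greedy(q_row):
    return 0 if q_row[0] >= q_row[1] else 1


def train_tabular(rng, episodes=2000, horizon=20, alpha=0.1, gamma=0.9, eps=0.2):
    """Stage 1: ordinary epsilon-greedy tabular Q-learning on the binned state."""
    q = [[0.0, 0.0] for _ in range(N_BINS)]
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            s = to_bin(x)
            a = rng.choice(ACTIONS) if rng.random() < eps else greedy(q[s])
            x, r = step(x, a)
            target = r + gamma * max(q[to_bin(x)])
            q[s][a] += alpha * (target - q[s][a])
    return q


def collect_dataset(q, rng, episodes=200, horizon=20):
    """Stage 2: roll out the tabular policy and record (continuous state, action)."""
    data = []
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            a = greedy(q[to_bin(x)])
            data.append((x, a))
            x, _reward = step(x, a)
    return data


def fit_classifier(data, epochs=300, lr=0.5):
    """Stage 3: supervised learning; logistic regression for P(action = 1 | x)."""
    w, b, n = 0.0, 0.0, len(data)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, a in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - a) * x
            gb += p - a
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def controller(params, x):
    """The trained classifier acts directly on the continuous state."""
    w, b = params
    return 1 if w * x + b > 0 else 0


rng = random.Random(0)
q_table = train_tabular(rng)
dataset = collect_dataset(q_table, rng)
params = fit_classifier(dataset)
```

In this sketch the classifier sees the raw continuous state rather than the bin index, which is where any generalisation beyond the table would come from. A D2D-SQL-style variant would instead use the tabular data to initialise a value network and then keep training it with another RL method; that step is not shown here.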
Similar resources
Massively Parallel Methods for Deep Reinforcement Learning
We present the first massively distributed architecture for deep reinforcement learning. This architecture uses four main components: parallel actors that generate new behaviour; parallel learners that are trained from stored experience; a distributed neural network to represent the value function or behaviour policy; and a distributed store of experience. We used our architecture to implement ...
Asynchronous Methods for Deep Reinforcement Learning
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neura...
Accelerated Methods for Deep Reinforcement Learning
Deep reinforcement learning (RL) has achieved many recent successes, yet experiment turnaround time remains a key bottleneck in research and in practice. We investigate how to optimize existing deep RL algorithms for modern computers, specifically for a combination of CPUs and GPUs. We confirm that both policy gradient and Q-value learning algorithms can be adapted to learn using many parallel ...
Efficient Parallel Methods for Deep Reinforcement Learning
We propose a novel framework for efficient parallelization of deep reinforcement learning algorithms, enabling these algorithms to learn from multiple actors on a single machine. The framework is algorithm agnostic and can be applied to on-policy, off-policy, value based and policy gradient based algorithms. Given its inherent parallelism, the framework can be efficiently implemented on a GPU, ...
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristic of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
Journal
Journal title: Neural Computing and Applications
Year: 2021
ISSN: 0941-0643, 1433-3058
DOI: https://doi.org/10.1007/s00521-021-06270-6